Efficient AUC Optimization for Classification

نویسندگان

  • Toon Calders
  • Szymon Jaroszewicz
چکیده

In this paper we show an efficient method for inducing classifiers that directly optimize the area under the ROC curve. Recently, AUC gained importance in the classification community as a mean to compare the performance of classifiers. Because most classification methods do not optimize this measure directly, several classification learning methods are emerging that directly optimize the AUC. These methods, however, require many costly computations of the AUC, and hence, do not scale well to large datasets. In this paper, we develop a method to increase the efficiency of computing AUC based on a polynomial approximation of the AUC. As a proof of concept, the approximation is plugged into the construction of a scalable linear classifier that directly optimizes AUC using a gradient descent method. Experiments on real-life datasets show a high accuracy and efficiency of the polynomial approximation.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Turning the hyperparameter of an AUC-optimized classifier

The Area under the ROC curve (AUC) is a good alternative to the standard empirical risk (classification error) as a performance criterion for classifiers. While most classifier formulations aim at minimizing the classification error, few methods exist that directly optimize the AUC. Moreover, the reported methods that optimize the AUC are often not efficient even for moderately sized datasets. ...

متن کامل

Increasing the accuracy of the classification of diabetic patients in terms of functional limitation using linear and nonlinear combinations of biomarkers: Ramp AUC method

The Area under the ROC Curve (AUC) is a common index for evaluating the ability of the biomarkers for classification. In practice, a single biomarker has limited classification ability, so to improve the classification performance, we are interested in combining biomarkers linearly and nonlinearly. In this study, while introducing various types of loss functions, the Ramp AUC method and some of...

متن کامل

A Structural SVM Based Approach for Optimizing Partial AUC

The area under the ROC curve (AUC) is a widely used performance measure in machine learning. Increasingly, however, in several applications, ranging from ranking and biometric screening to medical diagnosis, performance is measured not in terms of the full area under the ROC curve, but instead, in terms of the partial area under the ROC curve between two specified false positive rates. In this ...

متن کامل

Efficient Data Mining with Evolutionary Algorithms for Cloud Computing Application

With the rapid development of the internet, the amount of information and data which are produced, are extremely massive. Hence, client will be confused with huge amount of data, and it is difficult to understand which ones are useful. Data mining can overcome this problem. While data mining is using on cloud computing, it is reducing time of processing, energy usage and costs. As the speed of ...

متن کامل

Directly and Efficiently Optimizing Prediction Error and AUC of Linear Classifiers

The predictive quality of machine learning models is typically measured in terms of their (approximate) expected prediction error or the socalled Area Under the Curve (AUC) for a particular data distribution. However, when the models are constructed by the means of empirical risk minimization, surrogate functions such as the logistic loss are optimized instead. This is done because the empirica...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2007